[57] ABSTRACT

Topically relevant objects in an object database are first 
identified using any generally known methods to obtain a set 
of topically relevant objects (topically relevant set). Parents, 
and in alternative embodiments other ancestors, of one or 
more of the topically relevant objects are identified accord- 
ing to directional structural relationships that the parents 
have with respect to the topically relevant objects. These 
objects form a set of structurally relevant objects 
(structurally relevant set). In some embodiments, the user 
query identifies one or more of these structural relationships. 
The topically relevant objects are then organized under one 
or more of their respective parents to form a hierarchy level 
of both (topically relevant and structurally relevant) sets of 
objects. In some preferred embodiments, the process can 
iterate to create more than one hierarchy level. 

23 Claims, 11 Drawing Sheets 
SYSTEM AND METHOD FOR
HIERARCHICALLY GROUPING AND
RANKING A SET OF OBJECTS IN A QUERY
CONTEXT BASED ON ONE OR MORE
RELATIONSHIPS

FIELD OF THE INVENTION 

This invention relates to the field of searching and navi- 
gating a large object collection, particularly a hypermedia 
database in a networking environment. More specifically,
the invention relates to a system and method for generating 
grouped hierarchical views (with ranking) for a set of 
(hypermedia) objects in a query context based on one or 
more relationships. 

BACKGROUND OF THE INVENTION 

A hypermedia object database is a collection of hyper- 
media objects stored electronically as files on one or more 
computers. Hypermedia objects contain information in the
form of text, images, sound, or video. Hypermedia objects 
may also participate in relationships, where a relationship 
identifies one or more hypermedia objects that are related 
somehow (a hypermedia object may be related to itself). One 
common relationship is the hyperlink relationship. Two
hypermedia objects are related by a (directed) hyperlink 
relationship if one of the objects contains a hyperlink pointer 
to the other object. A hyperlink pointer is a reference to a 
hypermedia object that allows the target object to be 
accessed directly from the source object.

Users access a hypermedia object database to locate 
objects of interest and retrieve those objects for processing 
(e.g., reading, viewing, listening, analysis). Finding objects 
of interest by manually inspecting every object in a large 
database is impractical. Instead, users typically search the
database for interesting objects using a search system. A 
search system allows a user to express an information need 
in the form of a query. The system's search engine processes 
the query and returns to the user a hit-list of relevant objects. 
The user then selects interesting objects from the hit-list and
retrieves those objects. 

A relational database management system (RDBMS) may 
be used to index and search arbitrary hypermedia objects 
based on their attributes. Attributes include items such as 
size, creation date, author, and title. Searching for objects in
this fashion is well known. In addition to attribute-based 
searching, users may want to search for hypermedia objects 
based on their content. The algorithms and data structures 
used by a content-based search system depend on the kind 
of object being searched. Text objects are typically searched
using an information retrieval (IR) system (e.g., IBM Search 
Manager/2, a trademark of the IBM Corporation). Image 
objects are typically searched using an image indexing and 
retrieval system (e.g., IBM QBIC, a trademark of the IBM 
Corporation). Content-based search techniques for video
and sound exist and have been incorporated into prototype 
systems, but this technology is less mature than text and 
image search. Objects found using an attribute-based or 
content-based search system are said to be "topically
relevant" to the query.

Some prior art content-based search systems attempt to 
improve the search results for hypermedia object databases 
by refining object relevance scores based on the structural 
relationships (e.g., hyperlinks) between the objects. Three 
representative techniques are used by these systems. The
first technique is a form of "spreading activation," where
object relevance scores are propagated along outbound
hyperlink pointers to neighboring objects and used to
modify the relevance scores of those objects (see Cohen, P. 
R., and Kjeldsen, R. "Information Retrieval by Constrained 
Spreading Activation in Semantic Networks," Information 
Processing & Management, 23(2), pp. 255-268, 1987; 
Savoy, J. "Citation Schemes in Hypertext Information 
Retrieval," in M. Agosti and A. Smeaton (Eds.), Information 
Retrieval and Hypertext, Boston, Kluwer Academic 
Publishers, pp. 99-120, 1996). This procedure is typically 
iterated until a steady-state is reached or some terminating 
condition is met. 
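
By way of illustration only, the first technique may be sketched as follows. The function names, the decay factor, the iteration cap, and the convergence tolerance are assumptions made for this sketch; the cited references define their own propagation rules, which are not reproduced here.

```python
def spread_activation(base_scores, links, decay=0.5, max_iters=50, tol=1e-9):
    """Propagate relevance along outbound links until scores stop changing.

    base_scores: dict mapping object id -> content-based relevance score.
    links: dict mapping object id -> list of object ids it points to.
    decay, max_iters, tol: illustrative parameters (assumptions).
    """
    objects = set(base_scores) | set(links)
    for targets in links.values():
        objects.update(targets)
    if not objects:
        return {}
    current = {o: base_scores.get(o, 0.0) for o in objects}
    for _ in range(max_iters):
        # Every round restarts from the content-based score ...
        updated = {o: base_scores.get(o, 0.0) for o in objects}
        # ... and adds a decayed share of each source's current score.
        for source, targets in links.items():
            for target in targets:
                updated[target] += decay * current[source]
        change = max(abs(updated[o] - current[o]) for o in objects)
        current = updated
        if change < tol:  # steady state reached
            break
    return current


def flat_hit_list(scores):
    """Sort objects by final score, highest first; ties broken by id."""
    return sorted(scores, key=lambda o: (-scores[o], o))
```

The iteration cap plays the role of the "terminating condition" mentioned above, since scores on a cyclic link graph need not settle.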

The objects are then sorted by their final relevance scores 
and returned on a flat hit-list (i.e., the hit-list simply enu- 
merates the objects without describing any structural 
relationships). 

In the second technique, it is assumed that the hypermedia 
objects are organized in a given hierarchy, such that every 
object has at most one parent and the children of a given 
object are explicitly identified. An object's relevance score 
is then calculated as a function of its content-based rel- 
evance score and the relevance scores of its children. Rel- 
evance scores must be propagated from the leaves of a 
hierarchy to the root (see Frisse, M. E. "Searching for 
Information in a Hypertext Medical Handbook," Commu- 
nications of the ACM, 31(7), pp. 880-886, 1988). The 
objects are then sorted by their final relevance scores and 
returned on a flat hit-list. 
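
A minimal sketch of the second technique follows. The combining function (the object's own score plus a fixed fraction of the sum of its children's final scores) is an assumption chosen for illustration; the cited reference defines its own function, and the names used here are hypothetical.

```python
def propagate_up(children, content_scores, root, child_weight=0.5):
    """Compute final scores bottom-up in a single-parent hierarchy.

    children: dict mapping object id -> list of child object ids.
    content_scores: dict mapping object id -> content-based score.
    child_weight: illustrative weighting of the children's contribution.
    """
    final = {}

    def visit(node):
        # Children are scored first, so scores flow from leaves to root.
        child_total = sum(visit(child) for child in children.get(node, []))
        final[node] = content_scores.get(node, 0.0) + child_weight * child_total
        return final[node]

    visit(root)
    return final


def flat_hit_list(final_scores):
    """Sort objects by final score, highest first; ties broken by id."""
    return sorted(final_scores, key=lambda o: (-final_scores[o], o))
```

Note that the hierarchy is consulted only to compute scores; the result is still flattened into a hit-list, which is the limitation discussed later in this disclosure.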

In the third technique, the content of neighboring objects 
is added to the content of the current object when determin- 
ing the relevance score for the current object (see Croft et al. 
"Retrieving Documents by Plausible Inference: an Experi- 
mental Study," Information Processing & Management, 
25(6), pp. 599-614, 1989). Neighboring objects are those 
objects to which the current object contains hyperlink point- 
ers. As in the previous two techniques, objects are sorted by 
their relevance scores and returned on a flat hit-list.
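
The third technique can be sketched, under simplifying assumptions, as follows. Term counting stands in for whatever similarity measure a real system would use, and all names are hypothetical.

```python
def term_score(text, query_terms):
    """Count occurrences of the query terms in the text (toy measure)."""
    words = text.lower().split()
    return sum(words.count(term) for term in query_terms)


def neighbor_augmented_scores(contents, links, query_terms):
    """Score each object on its own content plus its neighbors' content.

    contents: dict mapping object id -> text content.
    links: dict mapping object id -> list of object ids it points to.
    """
    scores = {}
    for obj, text in contents.items():
        neighbor_text = " ".join(contents.get(n, "") for n in links.get(obj, []))
        scores[obj] = term_score(text + " " + neighbor_text, query_terms)
    return scores
```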

The above cited references are incorporated by reference 
in their entirety. 

Regardless of the search technology being used, most 
search systems follow the same basic procedure for indexing 
and searching a hypermedia object database. First, the 
objects to be searched must be input to the search system for 
indexing. Next, attributes and/or contents are extracted from 
the objects and processed to create an index. An index 
consists of data that is used by the search system to process 
queries and identify relevant objects. After the index is built, 
queries may be submitted to the search system. The query 
represents the user's information need and is expressed 
using a query language and syntax defined by the search 
system. The search system processes the query using the 
index data for the database and a suitable similarity ranking 
algorithm, and returns a hit-list of topically relevant objects. 
The user may then select relevant objects from the hit-list for 
viewing and processing. 
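
The index-then-query procedure described above can be illustrated with a toy inverted index. This is a sketch under stated assumptions: whitespace tokenization and term-frequency scoring are placeholders for a real extraction step and similarity ranking algorithm, and the function names are hypothetical.

```python
from collections import defaultdict


def build_index(objects):
    """Build an inverted index: term -> {object id: term frequency}.

    objects: dict mapping object id -> extracted text content.
    """
    index = defaultdict(dict)
    for obj_id, text in objects.items():
        for term in text.lower().split():
            index[term][obj_id] = index[term].get(obj_id, 0) + 1
    return index


def search(index, query):
    """Return a hit-list of (object id, score), best match first."""
    scores = defaultdict(int)
    for term in query.lower().split():
        for obj_id, count in index.get(term, {}).items():
            scores[obj_id] += count
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

Only the index is consulted at query time, which is why the index may reside on a different computer from the objects themselves.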

A user may also use objects on the hit-list as navigational 
starting points. Navigation is the process of moving from 
one hypermedia object to another hypermedia object by 
traversing a hyperlink pointer between the objects. This 
operation is typically facilitated by a user interface that 
displays hypermedia objects, highlights the hyperlinks in 
those objects, and provides a simple mechanism for travers- 
ing a hyperlink and displaying the referent object. One such 
user interface is a Web browser (see below). By navigating, 
a user may find other objects of interest. 

In a networking environment, the components of a hyper- 
media object database system may be spread across multiple 
computers. A computer comprises a Central Processing Unit
(CPU), main memory, disk storage, and software (e.g., a
personal computer (PC) like the IBM ThinkPad). (ThinkPad
is a trademark of the IBM Corporation.) A networking
environment consists of two or more computers connected
by a local or wide area network (e.g., Ethernet, Token Ring,
the telephone network, and the Internet). (See, for example,
U.S. Pat. No. 5,371,852 to Attanasio et al., issued on Dec. 6,
1994, which is herein incorporated by reference in its
entirety.) A user accesses the hypermedia object database
using a client application on the user's computer. The client
application communicates with a search server (the
hypermedia object database search system) on either the user's
computer (e.g. a client) or another computer (e.g. one or
more servers) on the network. To process queries, the search
server needs to access just the database index, which may be
located on the same computer as the search server or yet
another computer on the network. The actual objects in the
database may be located on any computer on the network.

A Web environment, such as the World Wide Web on the
Internet, is a networking environment where Web servers,
e.g. Netscape Enterprise Server and IBM Internet Connection
Server, and browsers, e.g. Netscape Navigator and IBM
WebExplorer, are used. (Netscape Navigator is a trademark
of the Netscape Communications Corporation and WebExplorer
is a trademark of the IBM Corporation.) Users can
make hypermedia objects publicly available in a Web
environment by registering the objects with a Web server.
Moreover, users can create arbitrary relationships between
these objects, even if the objects were created by another
user. Other users in the Web environment can then retrieve
these objects using a Web browser. The collection of objects
retrievable in a Web networking environment can be
considered as a large hypermedia object database.

To create an index for a hypermedia object database in a
Web networking environment, the prior art often uses Web
crawlers, also called robots, spiders, wanderers, or worms
(e.g., WebCrawler, WWWWorm), to gather the available
objects and submit them to the search system indexer. Web
crawlers make use of the (physical) hyperlinks stored in
objects. All of the objects are gathered by identifying a few
key starting points, retrieving those objects for indexing,
retrieving and indexing all objects referenced by the objects
just indexed (via hyperlinks), and continuing recursively
until all objects reachable from the starting points have been
retrieved and indexed. The graph of objects in a Web
environment is typically well connected, such that nearly all
of the available objects can be found when appropriate
starting points are chosen.

Having gathered and indexed all of the objects available
in the Web environment, the index can then be used, as
described above, to search for objects in the Web. Again, the
index may be located independently of the objects, the
client, and even the search server. A hit-list, generated as the
result of searching the index, will typically identify the
locations of the relevant objects on the Web, and the user
will retrieve those objects directly with their Web browser.

STATEMENT OF PROBLEMS WITH THE
PRIOR ART

When searching a hypermedia object database, the prior
art fails to create a hierarchically structured search result
based on arbitrary relationships between the objects in the
database. Some prior art systems consider relationships
when ranking objects, but the final result in most of these
systems is still a simple hit-list (comparable to systems that
ignore relationships altogether) that conveys no information
about the relationships between the objects. Those prior art
systems that can display a hierarchically structured search
result require that a single, pre-determined hierarchy be
specified for all of the objects in the hypermedia database.
For large databases (e.g., the World Wide Web), this
requirement is impossible to satisfy, and in general, this
requirement is restrictive and inflexible.

No prior art system fully exploits the relationships
between objects in a hypermedia object database to 1)
produce a search result that includes both topically and
structurally relevant objects ("structurally relevant" objects
are objects that may not be topically relevant, but are still
relevant to the query due to their structural relationships
with topically relevant objects), and 2) present the search
results such that the end user can easily identify and exploit
the relationships between the objects on the hit-list.

The prior art performs even worse in a networked
multi-user environment, e.g. the World Wide Web, where
documents are created by multiple authors and are poorly cross
referenced. For example, when a hypertext document is
created by a single author, its pages are typically well cross
referenced and provide links to next, previous, parent, index,
and table of contents pages. Even though a prior art search
system ignores these links and identifies only topically
relevant pages in the document, since the document is well
cross referenced, the user can navigate to structurally
relevant pages in the document by following the cross
references. However, the user does not have this option in a
networked multi-user environment where a second,
independent author could create a related document with
hypertext links to the document created by the first author. A
relationship now exists between these two documents, but
there are no good cross references from the first document
to the second document, preventing the user from navigating
to the second document. If the second document is not
topically relevant to the user's query, the user will fail to find
the second document even though it is structurally relevant
to the query due to its relationship to the first document.

OBJECTS OF THE INVENTION

An object of this invention is a system and method that
generates a hierarchical grouping of topically and
structurally relevant objects in a query context.

An object of this invention is a system and method that
generates a ranked, hierarchical grouping of topically and
structurally relevant objects in a query context.

An object of this invention is a system and method that
generates a hierarchical grouping of relevant objects in a
query context based on one or more structural relationships.

An object of this invention is a system and method that
generates ranked, hierarchical groupings of topically and
structurally relevant objects in a query context in a
networking environment.

An object of this invention is a system and method that
generates a hierarchical grouping of topically and
structurally relevant objects in a query context by: identifying
objects that are structurally relevant, organizing structurally
and topically relevant objects into meaningful groups,
identifying [...] navigational starting points, and presenting the
groups to a user in a meaningful way.

SUMMARY OF THE INVENTION

The present invention is a system and method for
identifying and hierarchically grouping one or more
(hypermedia) objects that are topically relevant to a user
query and/or have a structural relation with one another.
Topically relevant objects are first identified using any
generally known methods to obtain a set of topically
relevant objects (topically relevant set). Parents, and in
alternative embodiments other ancestors, of one or more of the
topically relevant objects are identified according to
directional structural relationships that the parents have with
respect to the topically relevant objects. These objects form
a set of structurally relevant objects (structurally relevant
set). In some embodiments, the user query identifies one or
more of these structural relationships. The topically relevant
objects are then organized under one or more of their
respective parents to form a hierarchy level of both
(topically relevant and structurally relevant) sets of objects.
In some preferred embodiments, the process can iterate to
create more than one hierarchy level.

In alternative embodiments, subsets of the topically
relevant set can be selected, e.g., by rank, as the topically
relevant set. Also, subsets of the structurally relevant set,
e.g., by weighting, can be selected as the structurally
relevant set. For example, the weight can be based on the
number of hyperlink paths that connect the ancestor to one
or more of the topically relevant objects, the strengths of
these hyperlink paths, and other attributes of the ancestor,
such as the total number of hyperlinks originating at the
ancestor and the topical relevance of the ancestor.

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing and other objects, aspects and advantages
will be better understood from the following detailed
description of preferred embodiments of the invention with
reference to the drawings that include the following:

FIG. 1 is a block diagram of the computing environment
in which the present invention is used in a non limiting
preferred embodiment.

FIG. 2, comprising FIGS. 2A and 2B, is a block diagram
of an example object collection, in particular a hypermedia
object database.

FIG. 3 is a block diagram of an example hypermedia
object database that is hierarchically grouped by the present
invention.

FIG. 4 is a block diagram of an example hypermedia
object database that is hierarchically grouped as defined by
a first structural relationship.

FIG. 5 is a block diagram of an example hypermedia
object database that is hierarchically grouped as defined by
a second structural relationship.

FIG. 6 includes prior art FIGS. 6A and 6B, where FIG.
6A shows an object catalog, and FIG. 6B a table of named
attribute values for objects in the catalog.

FIG. 7 comprises FIGS. 7A and 7B, where FIG. 7A shows
a structured query and FIG. 7B shows an object hit-list.

FIG. 8 shows a relationship catalog and several of the
relationship tables to which the relationship catalog refers.

FIG. 9 is a children table.

FIG. 10 is a flow chart showing the steps of one preferred
hierarchical view generator process executed by the present
invention.

FIG. 11 is a flow chart showing the steps of a preferred
process for scoring parent objects based on structural
relationships.

FIG. 12, comprising FIGS. 12A and 12B, are flow charts
showing the steps of a preferred process for displaying the
results in a ranked hierarchical order.

FIG. 13 is a screen dump showing a sample output result
of the present invention.

DETAILED DESCRIPTION OF THE
INVENTION

FIG. 1 is a block diagram of the computing environment
in which the present invention is used in a non limiting
preferred embodiment. The figure shows some of the
possible hardware, software, and networking configurations that
make up the computing environment.

The computing environment or system 100 comprises one
or more general purpose computers 170, 175, 180, 185, 190,
and 195 interconnected by a network 105. Examples of
general purpose computers include the IBM Aptiva personal
computer, the IBM RISC System/6000 workstation, and the
IBM POWER parallel SP2. (These are Trademarks of the
IBM Corporation.) The network 105 may be a local area
network (LAN), a wide area network (WAN), or the Internet.
Moreover, the computers in this environment may support
the Web information exchange protocol (HTTP) and be part
of a local Web or the World Wide Web (WWW). Some
computers (e.g., 195) may occasionally or always be
disconnected 196 from the network and operate as stand-alone
computers.

Hypermedia objects 140 are items such as books, articles,
reports, pictures, movies, or recordings that contain text,
images, video, audio, or any other multimedia object and/or
information. One or more hypermedia objects are stored on
one or more computers in the environment.

To find a particular hypermedia object in the environment,
a query (see FIG. 7A) is submitted for processing to a topical
search engine 120 running on a computer in the
environment. The topical search engine uses an index 130 to identify
hypermedia objects that are relevant to the query. The
topical search engine creates an index by indexing a
particular set of hypermedia objects in the environment, called
a hypermedia object database 141. A hypermedia object
database 141 may comprise hypermedia objects located
anywhere in the computing environment, e.g., spread across
two or more computers. The relevant hypermedia objects
identified by the index are ranked and returned by the topical
search engine in the form of a hit-list (see FIG. 7B). The
process is well known in the prior art. Examples of topical
search engines include Search Manager/2 (a trademark of
the IBM Corporation).

The result of the search is further processed by submitting
the query and topical search engine results to a novel
hierarchical view generator 110. The hierarchical view
generator 110 uses structural relationships in the index 130 (see
FIG. 8) to analyze (parent selection and ranking) the
relationships in the database and improve the search result. The
relationships are stored in the index by the
hierarchical view generator at [...] time. In some preferred
embodiments, the hierarchical view generator selects
structurally relevant objects by calculating parent ranks based on
the ranks of the topically relevant objects (generated by the
topical search engine) and/or the (weighted) structural
relationships in which the topically relevant objects participate.
Structurally relevant objects have a structural relationship
(see FIG. [...]) with one or more of the topically relevant
objects. (Structurally relevant objects may or may not be
topically relevant.) The hierarchical view generator then
aggregates topically relevant objects based on their
relationships and generates ranked hierarchies of both the topically
relevant objects and structurally relevant objects to present
to the user. For convenience, the topical search engine 120
and hierarchical view generator 110 are shown here as
separate components. Note, however, that both systems may
be internal components of a general hypermedia object
search system.

Hypermedia objects 140 and/or indexes 130 on one
computer may be accessed over the network by another
computer using the Web (http) protocol, a networked file
system protocol (e.g., NFS, AFS), or some other protocol.
Services on one computer (e.g., topical search engine 120)
may be invoked over the network by another computer using
the Web protocol, a remote procedure call (RPC) protocol,
or some other protocol.

A number of possible configurations for accessing
hypermedia objects, indexes, and services locally or remotely are
depicted in the present figure. These possibilities are
described further below.

One configuration is a stand-alone workstation 195 that
may or may not be connected to a network 105. The
stand-alone system 195 has hypermedia objects 140 and an
index 130 stored locally. The stand-alone system 195 also
has a topical search engine 120 and hierarchical view
generator 110 installed locally. When the system is used, a
query is input to the workstation 195 and processed by the
local topical search engine 120 and hierarchical view
generator 110 using the index 130. The results from the topical
search engine are output by the workstation 195.

A second configuration is 185, a workstation with
hypermedia objects and indexes connected to a network 105. This
configuration is similar to the stand-alone workstation 195,
except that 185 is always connected to the network 105.
Also, the local index 130 may be derived from local
hypermedia objects 140 and/or remote hypermedia objects
accessed via the network 105, and created by either a local
topical search engine 120 and hierarchical view generator
110 or a remote topical search engine 120 and/or a remote
hierarchical view generator 110 accessed via the network
105. When queries are input at the workstation 185, they
may be processed locally at 185 using the local topical
search engine 120, local hierarchical view generator 110,
and local index 130. Alternatively, the local topical search
engine 120 and hierarchical view generator 110 may access
a remote index 130 (e.g. on system 175) via the network 105.
Alternatively, the workstation 185 may access a remote
topical search engine 120 and/or hierarchical view generator
110 via the network 105.

Another possible configuration is 175, a workstation with
an index only. Computer 175 is similar to computer 185 with
the exception that there are no local hypermedia objects 140.
The local index 130 is derived from hypermedia objects 140
accessed via the network 105. Otherwise, as in computer
185, the index 130, topical search engine 120, and
hierarchical view generator 110 may be accessed locally or
remotely via the network 105 when processing queries.

Another possible configuration is computer 180, a
workstation with hypermedia objects only. The hypermedia
objects 140 stored locally at computer 180 may be accessed
by remote topical search engines 120 and hierarchical view
generators 110 via the network 105. When queries are
entered at computer 180, the topical search engine 120,
hierarchical view generator 110, and index 130 must all be
accessed remotely via the network 105.

Another possible configuration is computer 190, a client
station with no local hypermedia objects 140, index 130,
topical search engine 120, or hierarchical view generator
110. When queries are entered at computer 190, the topical
search engine 120, hierarchical view generator 110, and
index 130 must all be accessed remotely via the network
105.

Another possible configuration is computer 170, a typical
web server. Queries are entered at another workstation (e.g.,
175, 180, 185, or possibly 195) or a client station (e.g., 190)
and sent for processing to the web server 170 via the
network 105. The web server 170 uses a remote topical
search engine 120, hierarchical view generator 110, and
index 130 (accessed via the network 105) to process the
query. Alternatively, one or more of these functions (110,
120, and 130) can reside on the web server 170. The results
are returned to the workstation or client station from which
the query was originally sent.

FIGS. 2 and 3 give an intuitive description of the current
invention. The current invention operates on an object
collection, which consists of objects and directed
relationships between those objects. (The hypermedia object
database 141 in FIG. 1 is an example of an object collection.) An
object is an identifiable entity that typically contains data,
called the object's content. The nature of this data depends
on the kind of object. For example, a document object would
contain text data, while an image object would contain
image data.

A directed relationship describes how objects in the object
collection are related to each other. A directed relationship R
consists of a set of instances of R, where an instance of
relationship R, denoted R(o1,o2), indicates that relationship
R holds from object o1 to object o2. A relationship
instance may optionally have a weight associated with it,
R(o1,o2,w), where w is the weight of the instance of
relationship R between objects o1 and o2. If R(o1,o2), then
object o1 is a parent of object o2, and o2 is a child of
o1. If object o2 can be reached from object o1 by
following one or more instances of relationship R, then there
is a path in relationship R from object o1 to object o2. The
weight of a path is a function of, e.g. the sum total, of the
weights of the relationship instances that make up the path.
A directed relationship defines a structural organization of
the objects in the collection, therefore a directed relationship
is also called a structural relationship.

Note that an undirected relationship is supported by
representing it as a directed relationship that holds in both
directions, e.g., undirected relationship U between object o1
and object o2 would be represented by two instances of the
directed relationship U', e.g., U'(o1,o2) and U'(o2,o1).

An example of a directed relationship is the hyperlink
relationship H. If object o1 contains a hyperlink pointer to
object o2, then H(o1,o2). Other examples of directed
relationships include the subclass or categorization relationship,
the geographic location relationship, the location within a
file system relationship, the conceptual relationship, and the
is-a-part-of relationship. (Note that this invention applies to
all sets of general objects with directed relationships (or
"bi-directional" representations of undirected relationships),
i.e. structural relationships. In many parts of this disclosure,
these general objects are referred to as hypertext objects
with structural relationships being hyperlink relationships.
This is done for convenience without loss of generality.)

FIG. 2, comprising FIGS. 2A and 2B, is a block diagram
of an example object collection, in particular a hypermedia
object database 50 comprising a set of hypermedia objects,
typically A-J, with hyperlinks, typically 15, between them.
The hyperlinks define a structural relationship on the
hypermedia objects in the database. Note that this is just one
possible example object collection. Many other kinds of
object collections exist with many different kinds of objects
and many different relationships, all of which are within the
domain of the current invention. FIG. 2A shows the initial
state of the hypermedia object database. FIG. 2B depicts the
database after a topical search engine 120 has been used to 
search the database. The topical search engine analyzes the 
contents of the hypermedia objects to identify the topically 
relevant objects, e.g. C-G, which are shaded in FIG. 2B. At 
this point, generally known topical search engines 120 
would present this result to the user by simply reporting 
objects C-G in relevance rank order on the hit-list. 
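
For illustration, the object collection and directed relationship defined earlier (instances R(o1,o2,w), parents, children, paths, and path weights) could be modeled as in the sketch below. The class and method names are hypothetical, the default weight of 1.0 is an assumption, and the sum is used as the path-weight function because the text names it as one example.

```python
from collections import defaultdict


class Relationship:
    """A directed (structural) relationship R over an object collection."""

    def __init__(self, name):
        self.name = name
        self.children = defaultdict(dict)  # parent -> {child: weight}
        self.parents = defaultdict(dict)   # child -> {parent: weight}

    def add(self, o1, o2, w=1.0):
        """Record the instance R(o1, o2, w): o1 is a parent of o2."""
        self.children[o1][o2] = w
        self.parents[o2][o1] = w

    def add_undirected(self, o1, o2, w=1.0):
        """Represent an undirected relationship as two directed instances."""
        self.add(o1, o2, w)
        self.add(o2, o1, w)

    def has_path(self, o1, o2):
        """True if o2 is reachable from o1 via one or more instances."""
        stack = list(self.children.get(o1, {}))
        seen = set()
        while stack:
            node = stack.pop()
            if node == o2:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(self.children.get(node, {}))
        return False

    def path_weight(self, path):
        """Sum of instance weights along a list of objects.

        Raises KeyError if a step is not a recorded instance.
        """
        return sum(self.children[a][b] for a, b in zip(path, path[1:]))
```

The specific links between objects A-J appear only in the drawings, so any links fed to this sketch are made-up sample data.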

FIG. 3 is a block diagram of the same example hyper- 
media object database shown in FIG. 2A, with the addition 
of an example search result (differently shaded) produced by 
the current invention (the hierarchical view generator 110). 
When searching the hypermedia object database 5, topically 
relevant objects, C-G, are first identified using any general 
topical search engine, as shown in FIG. 2B. 

Then, the hierarchical view generator identifies parent 
objects, e.g. A and H, of the topically relevant objects, C-G, 
by examining the (hyperlink) structural relationship. In one 
preferred embodiment, the hierarchical view generator ranks 
each parent based on the number of the hyperlink paths from 
the parent to the topically relevant objects and sometimes 
factors in a function of the weights of those paths. In some 
embodiments, the hierarchical view generator includes only 
the top ranking parents in the search result. The (top ranking) 
parents are structurally relevant objects, i.e., objects that 
may not be topically relevant, but are still relevant to the 
query due to their structural relationships with topically 
relevant objects. Note that topically relevant objects may 
also be structurally relevant. 
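
As a rough illustration of the ranking just described, the following Python sketch counts, for each candidate parent, the hyperlinks that lead into the topically relevant set and optionally keeps only the top-ranking parents. It is a simplification under stated assumptions: path weights are ignored, the links from A and H follow the FIG. 3 example, and object B, its link to J, and the names `links`, `rank_parents`, and `top_k` are hypothetical.

```python
from collections import Counter

# Directed hyperlinks (parent -> child). A and H follow the FIG. 3 example;
# B -> J is a hypothetical link to an object that is not topically relevant.
links = [
    ("A", "C"), ("A", "D"), ("A", "E"),
    ("H", "C"), ("H", "G"), ("H", "F"),
    ("B", "J"),
]
topical = {"C", "D", "E", "F", "G"}


def rank_parents(links, topical, top_k=None):
    """Rank parents by the number of links they have into the topical set."""
    counts = Counter(src for src, dst in links if dst in topical)
    # Sort by descending count, breaking ties by object id for determinism.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked if top_k is None else ranked[:top_k]


parents = rank_parents(links, topical)
```

Here B never appears in `parents` because none of its links reach a topically relevant object, which mirrors the idea that only objects with paths into the topically relevant set are structurally relevant.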

Thus the invention selects from the database of objects a 
set of one or more topically relevant objects (the topically 
relevant set, C-G) and a set of one or more structurally 
relevant objects (the structurally relevant set, A and H). Here 
the structurally relevant set includes objects that are parents 
of the objects in the topically relevant set. Note that the 
parents (optionally the top ranking parents) are structurally 
relevant objects, i.e., objects that may not be topically 
relevant, but are still relevant to the query due to their 
structural relationships with topically relevant objects. Note 
that topically relevant objects may also be structurally 
relevant. Further note that structurally relevant objects do 
not have to have a parent structural relationship to one or 
more topically relevant objects, but can have any ancestral 
relationship defined by the path to the topically relevant 
object(s). Also, one or more structural relationships can be 
used. (See below for a more detailed description.) 

The identification of structurally relevant objects by the 
hierarchical view generator provides a number of benefits. 
First, relevant objects that were missed by the topical search 
engine are found by the hierarchical view generator and 
returned to the user. The topical search engine finds only 
those objects whose content matches the particular query 
terms selected by the user. In the current hypermedia object 
database example, additional (structurally) relevant objects 
can be found by considering the hyperlink pointers incident 
on the topically relevant objects. Hyperlink pointers are 
typically used to cite related objects, direct readers to related 
objects, or provide a path for reading the different compo- 
nents of a hypermedia object. Therefore, the more hyperlink 
pointers an object has to the set of topically relevant objects, 
the more likely it is that the object is relevant to the query. 
Moreover, if an object points to many topically relevant 
objects, that object is likely to contain a more general 
discussion of the query topic. 

Second, the structurally relevant objects found by the 
hierarchical view generator are good navigational starting
points. Good navigational starting points are those objects
from which many topically relevant objects can be reached 
by following one or more directed relationship paths from 
the starting point. In FIG. 3, structurally relevant objects A 
and H have been shaded, and their hyperlinks to the original 
set of topically relevant objects (C-G) have been empha- 
sized with a different shading, typically 25. It can be seen 
from the hyperlink structure that by starting with either 
object A or H, all of the topically relevant objects (C, D, E, 
F, and G) in the database can be reached by following a 
single hyperlink. Because objects A and H together provide 
hyperlink pointers to all of the topically relevant objects, 
they are good navigational starting points for browsing the 
hypermedia object database given the current interest of the 
user. 

Third, using the structurally relevant objects, the hierar- 
chical view generator organizes the relevant objects into 
meaningful groups or clusters. Groups are formed based on 
the following observation: a parent object often contains 
information on a particular theme, so its children are likely 
to share that common theme. In FIG. 3, parent objects A and 
H each provide a grouping or clustering of the topically 
relevant objects. Object A contains hyperlink pointers to 
objects C, D, and E, while object H contains pointers to 
objects C, G, and F, such that the topically relevant objects 
are divided into two (overlapping) clusters. It can be inferred 
that objects C, D, and E are related by the topic of object A, 
and objects C, G, and F are related by the topic of object H. 
Therefore, it is productive to organize the topically relevant 
objects into these two clusters and present this organization 
to the user. 
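
The grouping in this example can be sketched in a few lines of Python. The sketch below assumes the FIG. 3 links stated above (A to C, D, E and H to C, G, F); the function name `cluster_by_parent` and the variables `covered` and `shared` are hypothetical and only serve the illustration.

```python
from collections import defaultdict

# Parent -> child hyperlinks as described for FIG. 3.
links = [
    ("A", "C"), ("A", "D"), ("A", "E"),
    ("H", "C"), ("H", "G"), ("H", "F"),
]
topical = {"C", "D", "E", "F", "G"}


def cluster_by_parent(links, topical):
    """Group each topically relevant child under every parent linking to it."""
    clusters = defaultdict(list)
    for parent, child in links:
        if child in topical:
            clusters[parent].append(child)
    return {parent: sorted(children) for parent, children in clusters.items()}


clusters = cluster_by_parent(links, topical)
# Objects reachable from the parents by following a single hyperlink.
covered = set().union(*clusters.values())
# Objects that belong to both clusters (the overlap).
shared = set(clusters["A"]) & set(clusters["H"])
```

In this sketch `covered` equals the whole topically relevant set, which is what makes A and H good navigational starting points, and `shared` contains only C, the object on which the two clusters overlap.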

Moreover, the groups created using the structurally rel- 
evant objects form a hierarchy of structurally and topically 
relevant objects. In FIG. 3, the dashed hyperlinks 25 orga- 
nize the topically and structurally relevant objects into a 
hierarchy, which is much more meaningful to the user than 
an arbitrary graph, which is all that is known before the 
hierarchical view generator 110 executes. This hierarchical 
presentation shows only those objects that are structurally 
and topically relevant to the user query. The objects in this 
hierarchy can be ranked further if necessary. 

Note that by defining different structural relationships in 
the object collection (see FIG. 8), different paths exist and 
therefore, different sets of (parent) objects, i.e., different 
structurally relevant sets, are identified for any given set of 
topically relevant objects. FIGS. 4 and 5 explore this further. 

FIG. 4 is a block diagram of the same example hyper- 
media object database 50 shown in FIG. 3, with the excep- 
tion that an alternate result that might be produced by the 
current invention using a different relationship, e.g., geo- 
graphic location, is highlighted. For clarity, the hyperlink 
pointers 15 between the objects A-J are not shown. Here, the 
objects participate in geographical (structural) relationships, 
typically X and Y. All objects that participate in the X 
relationship (i.e., are assigned to the geographical location 
described by X) are connected to X by structural links 41. 
All objects that participate in the Y relationship (i.e., are 
assigned to the geographical location described by Y) are 
connected to Y by structural links 42. 

After the topically relevant objects C-G are identified by 
a topical search engine, the hierarchical view generator 110 
uses the geographical relationship links 41 and 42 to identify 
the structural geographical relationship parents X and Y. X 
and Y can then be used to organize the search results into a 
hierarchy where topically relevant objects are grouped by 
geographic location. 
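
To make the effect of choosing a different relationship concrete, the sketch below looks up parents of the same topically relevant set under two named relationships. The text does not enumerate which objects are attached to X and Y, so the geographic assignments here are purely hypothetical, as are the names `relationships` and `parents_for`.

```python
# Each relationship name maps to directed (parent, child) pairs.
relationships = {
    "hyperlink": [("A", "C"), ("A", "D"), ("A", "E"),
                  ("H", "C"), ("H", "G"), ("H", "F")],
    # Hypothetical assignment of objects to locations X and Y
    # (standing in for structural links 41 and 42).
    "geography": [("X", "C"), ("X", "D"), ("X", "F"),
                  ("Y", "E"), ("Y", "G"), ("Y", "B")],
}
topical = {"C", "D", "E", "F", "G"}


def parents_for(relationship_name):
    """Return the parents of the topical set under one named relationship."""
    pairs = relationships[relationship_name]
    return sorted({parent for parent, child in pairs if child in topical})


hyperlink_parents = parents_for("hyperlink")
geography_parents = parents_for("geography")
```

The same five topically relevant objects are grouped under A and H by hyperlink and under X and Y by location, so the structurally relevant set depends on which relationship is consulted.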
FIG. 5 is a block diagram of the same example hypermedia
object database 50 shown in FIG. 3, with the exception that an
alternate result that might be produced by the current
invention using a different relationship, e.g., category, is
shaded. For clarity, the hyperlink pointers 15 between the
objects A-J are not shown. Here, the objects participate in
categorization relationships, typically M-R. All objects that
participate in a given categorization relationship (i.e., are
assigned to that category, possibly with some confidence
weight) are connected to that category by structural links 51.

After the topically relevant objects C-G are identified by
a topical search engine, the hierarchical view generator uses
the categorization relationship links 51 to identify category
relationship parents N, P, and R. Even though some topically
relevant objects participate in multiple categorization
relationships, the structural links 51 with confidence weights
are used as evidence to identify N, P, and R as the parents.
N, P, and R can then be used to organize the search results
into a hierarchy where topically relevant objects are grouped
by category. In particular, the structural links used for this
grouping are highlighted as dashed links 52 in FIG. 5.

In subsequent figures, a number of data-structures are
described as tables. This is for convenience of drawing and
description. In an actual implementation, any usual
data-structure such as a normal array, an associative array, a
linked list, a hash table, or any other structure may
equivalently be used without affecting the invention described
herein. Note, however, that the tables described below can
be readily implemented in any general relational database
management system (RDBMS).

FIG. 6, comprising FIGS. 6A and 6B, is one preferred set
of prior art data structures in the index 130 used to
implement the present invention. The Object Catalog 210 in
FIG. 6A stores information about each of the objects in the
collection. The Object Catalog is a table that contains an
entry 215 for each object in the collection. Each entry 215
contains the following information, represented by columns
in the table 210: sequence Number 220 (the number or offset
of the current entry in the table), Object Id 225 (a unique
identifier for the object that corresponds to the current
entry), Object Pointer 230 (a reference to the object
corresponding to the current entry that allows the object to be
retrieved), object Title 235 (the title of the object
corresponding to the current entry) and a set 240 of Attributes
245 associated with the object corresponding to the current
entry. In a preferred embodiment, each Attribute 245 is a
pointer to an Attribute Table 250, shown in FIG. 6B. Using a
table 250 permits an attribute to be a multi-dimensional value.
Each dimension is given a Name 255 and a Value 260.
Therefore, table 250 has a pair of Name 255 and Value 260
fields for each dimension of the attribute. These data
structures are part of the index 130 and their contents would
be filled in when the object collection is indexed.

FIG. 7, comprising FIGS. 7A and 7B, is one preferred set
of data structures used to implement the present invention.
In FIG. 7A, the data structure 310 represents a Query
arranged in an array or list as a sequence of n Query Factors
315. Each Query Factor 315 consists of a Query Element
320, an optional Weight 325 and an optional Connector 330.
The Query also contains a list 340 of zero or more
relationships 345 (see FIG. 8) to be used by the Hierarchical
View Generator 110 (see FIG. 10).

In FIG. 7B, a novel Object Hit-list 350 is used to identify
in rows 355 the objects out of Object Catalog 210 that have
been selected as the result of a search by Topical Search
Engine 120 or processing by Hierarchical View Generator
110, i.e., the topically relevant set and the structurally
relevant set(s). Each row contains fields for the object's
Number in the list 360, its computed Score/Rank 365
(assigned by the topical search engine 120 or the
Hierarchical View Generator 110 (see FIG. 11)), its Object Id
370, a Selector 375 (used to mark entries for future
processing), and a pointer 380 to a Children Table (see FIG. 9).
The Children field 380 is only used when the Object Hit-list
350 is created by the Hierarchical View Generator 110, or only
for objects in the structurally relevant set(s).

FIG. 8 is one preferred set of novel data structures in
index 130 used to describe and catalog structural
relationships in the object collection and implement the
present invention. By definition, a relationship R is a set of
ordered tuples (triples) of the form (o1, o2, w), where w is an
optional weight. The triple (o1, o2, w) indicates that the
relationship R exists from object o1 to object o2 with weight
w. For example, if relationship R is the hyperlink
relationship, (o1, o2, w) indicates that object o1 contains a
hyperlink pointer to object o2 with weight w. In FIG. 2A, this
is shown as a structural link 15 from object o1 to object o2.
Alternatively, if R is the categorization relationship,
(o1, o2, w) indicates that object o2 belongs to the category o1
with weight w. In FIG. 5, this is shown as a structural link 51
from category o1 to object o2. The Relationship Catalog 410
is a table containing rows 415 for the purpose of cataloging
the relationships in the object collections and determining
which particular relationships are to be used to process the
current query. The fields of the rows 415 consist of Selector
420, Relationship Name 425, and Relationship Table Pointer
435. A Relationship Table 450 contains rows 455 showing how
objects are related to each other by the current relationship.
The fields of the rows 455 consist of From-Id 460, To-Id 465,
and optional Weight 470, i.e., the triple above. Relationship
Table 450 also has a Selector field 473 used for marking
selected entries 455 for later processing. The Relationship
Catalog is generated at database indexing time and is part of
the index 130.

FIG. 9 is one preferred data structure used to implement
the present invention. FIG. 9 shows Children Table 580
consisting of entries 590, where each entry has an Object
Hit-list identifier field 595, a Child Number field 585, and a
Relation field 575. The Object Hit-list identifier 595
identifies the Object Hit-list 350 that contains an entry for
the current child object. The Child Number 585 is an index into
the Number field 360 (of the Object Hit-list 350 identified in
595) identifying which entry 355 corresponds to the current
child object. The Relation field 575 corresponds to
Relationship Name 425 in the Relationship Catalog 410, and
identifies the relationship by which the child belongs to the
corresponding parent.

FIG. 10 is a flowchart showing the method steps of one
preferred process executed by the present invention. By
executing the process 600, the system 110 produces ranked
hierarchical views for a set of objects in a query context. In
the descriptions of processes 600 and 700 that follow, the
convention is used that if a numerical label is used to
describe a quantity of like objects, such as rows in a table,
and if it is necessary to indicate one specific such instance,
then the numerical label is used with an alphabetic suffix,
resulting in descriptors such as 123a, for example.

The process begins with a Query 310 entered by the user
in step 605 and an Object Hit-list 350 of objects identified
by the user in step 610. In a preferred embodiment the
Object Hit-list 350, which identifies a subset of the objects
in Object Catalog 210, is the hit-list of topically relevant
objects (the topically relevant set) generated by a topical
search engine with Query 310 on Object Collection 141 (i.e., 
a content-based search). In step 615, the objects listed in 
Object Hit-list 350 are ranked with respect to Query 310, 
with resulting object ranking values placed in Rank field
365. In a preferred embodiment, the object Ranks 365 are 
supplied with the Object Hit-list 350 in step 610, so ranking 
of the topically relevant objects by the present system is not 
necessary and step 615 is omitted. 
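
The tables that process 600 reads and writes (the Object Hit-list 350, the Relationship Table 450, and the Children Table 580) might be modeled as follows. This is only a sketch: the specification leaves the concrete representation open, and the Python class names, the default weight of 1.0, and the example scores are assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ChildEntry:            # Children Table entry 590
    hit_list: "HitList"      # Object Hit-list identifier 595
    child_number: int        # Child Number 585
    relation: str            # Relation 575


@dataclass
class HitListEntry:          # Object Hit-list entry 355
    number: int              # Number 360
    score: float             # Score/Rank 365
    object_id: str           # Object Id 370
    selector: int = 0        # Selector 375
    children: List[ChildEntry] = field(default_factory=list)  # Children 380


@dataclass
class HitList:               # Object Hit-list 350
    entries: List[HitListEntry] = field(default_factory=list)


@dataclass
class RelationshipRow:       # Relationship Table row 455
    from_id: str             # From-Id 460
    to_id: str               # To-Id 465
    weight: float = 1.0      # Weight 470 (assumed default when omitted)
    selector: int = 0        # Selector 473


# Hypothetical topical hit-list with made-up scores.
topical_hits = HitList([
    HitListEntry(number=1, score=0.9, object_id="C"),
    HitListEntry(number=2, score=0.7, object_id="D"),
])
parent = HitListEntry(number=1, score=0.0, object_id="A")
parent.children.append(
    ChildEntry(topical_hits, child_number=1, relation="hyperlink")
)
row = RelationshipRow(from_id="A", to_id="C", weight=0.5)
```

A parent entry refers to its children indirectly, through the lower-level hit-list and a rank number, which is the arrangement the Children Table describes.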

Next, the system enters loop 635 where each iteration of
the loop builds the next higher level of the result hierarchy.
The loop is controlled by step 635, which tests whether or
not to build another level of the hierarchy. If another level
of the hierarchy should be built, branch 636 is taken. If the
hierarchy building process is complete, branch 637 is taken.
The test in step 635 may be performed in a variety of ways.
In a highly interactive system, the user may be prompted as
to whether or not to build another level of the hierarchy. This
prompting may be accompanied by executing process 900
(described below) to display to the user the current result
built so far. Alternatively, a pre-defined system constant may 
determine how many levels of the hierarchy to build. 

When branch 636 is taken, the system executes a series of 
steps to add another level to the result hierarchy. Each level 
of the result hierarchy is stored in an Object Hit-List 350. 
The Object Hit-List 350 for the current level is created and 
initialized in step 620. Initialization involves setting all 
entries in the Object Hit-List 350 to null. In step 625, the 
relationships identified by the user in 340 (see FIG. 7A) are
used to select relationships in the relationship catalog 410 
for generating the next level of the hierarchy in subsequent 
steps. For each relationship 345 specified by the user, the 
Relationship Name column 425 is searched to find a match 
and the Selector field 420 of the matching entry 415 is set to 
1. The Selector field 420 of all other entries is set to 0. Note 
that in some preferred embodiments, the relationships can be 
default relationships, and query component 340 is not pro- 
vided by the user. For example, the default can be hypertext 
links. 
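
Step 625 can be sketched as a pass over the Relationship Catalog that sets each Selector to 1 or 0. The dictionary layout, the function name `select_relationships`, and the constant `DEFAULT_RELATIONSHIPS` are assumptions for illustration; the hyperlink default follows the example given in the text.

```python
# Assumed default when the query's relationship list 340 is empty.
DEFAULT_RELATIONSHIPS = ["hyperlink"]


def select_relationships(catalog, requested):
    """Step 625 sketch: set Selector to 1 for requested names, 0 otherwise."""
    wanted = set(requested) if requested else set(DEFAULT_RELATIONSHIPS)
    for row in catalog:
        row["selector"] = 1 if row["name"] in wanted else 0
    # Return the selected names in catalog order.
    return [row["name"] for row in catalog if row["selector"] == 1]


# Hypothetical catalog with three relationships; "table" stands in for
# the Relationship Table Pointer 435.
catalog = [
    {"selector": 0, "name": "hyperlink", "table": []},
    {"selector": 0, "name": "geography", "table": []},
    {"selector": 0, "name": "category", "table": []},
]
chosen = select_relationships(catalog, ["geography", "category"])
```

Calling the function again with an empty list falls back to the assumed default and clears the selectors that the earlier call had set.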

Step 630 begins an iteration over all of the relationships 
selected in step 625. Each relationship 415 marked as 
selected in field 420 is processed as follows in turn, by 
following branch 631. The relationship currently being 
processed, called the current relationship, will be referred to 
as 415a. When all selected relationships have been 
processed, branch 632 is taken. 

Each iteration of steps 630 and 700 produces one struc- 
turally relevant set defined by a given structural relationship, 
e.g., hypertext link, geographic location, category, etc. For 
each object in this structurally relevant set, if the object 
already appears in the current Object Hit-list 350, its entry 
355 is updated, otherwise an entry for the object is added to 
the current Object Hit-list (see FIG. 11). All of the objects in
the current Object Hit-list 350 define a structurally relevant 
set based on a combination of one or more structural 
relationships. This structurally relevant set is defined for the 
given level of the generated hierarchy (hierarchy level). 

In step 700 (described in detail in FIG. 11), the Object 
Hit-list 350 for the current result hierarchy level being built
is updated for the current relationship. 

After all selected relationships have been processed, 
branch 632 is taken to step 635. At this point, a new level of 
the hierarchy (e.g. parent, grandparent, great-grandparent 
along a path) has been created and the objects in that level
are identified by the entries 355 in the Object Hit-list 350
that was created and populated by the steps executed in the
current iteration of 635. Each entry 355 has a Children field
380 that identifies the children of the current entry. The 
Children field 380 points to a Children Table 580, which 
contains an entry 590 for each child of the current object. 
Each Children Table entry 590 has an Object Hit-list field 
595 that identifies the Object Hit-list 350a (the next lower 
level of the result hierarchy) that contains the child, a Child 
Number 585 that identifies the entry 355 in the Object 
Hit-list 350a that corresponds to the child object, and the 
Relation 575 that identifies the relationship by which this 
child belongs to the current object. 

In step 800, the entries 355 in Object Hit-list 350 are
sorted according to Score 365, and the ranking order is
entered into Number field 360. The process then iterates at
step 635. Note that in one embodiment, there is only one
hierarchy level, i.e., only parents of the topically relevant
set. In this case, there is no iteration step 635 and the entries
355 can optionally be displayed (step 900) after step 800.
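
A minimal sketch of the sorting step follows. It assumes that a higher score means a better rank and that Number values start at 1; neither detail is stated explicitly, and the entry layout and the name `rank_hit_list` are hypothetical.

```python
def rank_hit_list(entries):
    """Step 800 sketch: sort by descending score, then write rank numbers."""
    # sorted() is stable, so entries with equal scores keep their order.
    ordered = sorted(entries, key=lambda e: e["score"], reverse=True)
    for position, entry in enumerate(ordered, start=1):
        entry["number"] = position
    return ordered


# Hypothetical parent entries with made-up scores.
entries = [
    {"object_id": "H", "score": 1.5, "number": None},
    {"object_id": "A", "score": 2.0, "number": None},
    {"object_id": "B", "score": 0.5, "number": None},
]
ranked = rank_hit_list(entries)
```

After the call, each entry carries its rank in `number`, which is the value a later top-N selection would compare against.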

In step 900 (described in detail in FIG. 12), the results are 
displayed. Note that one or more levels of the hierarchy
can be suppressed in the display, e.g., grandparents are 
displayed but not parents. 

FIG. 11 shows the steps for process 700 in detail. Step 700 
identifies and ranks parent objects for the given child objects 
based on the current structural relationship. The parent 
objects are entered into the Object Hit-list 350 for the current 
result hierarchy level being built (i.e., the Object Hit-list 350 
created in step 620). The given children are identified in a 
different Object Hit-list 350, which was either input in step 
610 (if this is the first iteration of 635), or was created by the 
previous iteration of 635. To avoid confusion, the Object 
Hit-list for the current result hierarchy level being built (i.e., 
the parents) will be referred to as 350'. 

Note that if this is the first iteration of 635, then the 
current child objects are the topically relevant objects iden- 
tified by the topical search engine. Otherwise, the current 
child objects are the structurally relevant objects identified 
during the previous iteration of 635, and the parents cur- 
rently being identified and ranked are at least grandparents 
of the topically relevant objects. 
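
The way each iteration feeds the next can be sketched as a loop in which the parents found in one pass become the children of the following pass. Ranking is omitted here, the early exit when no parents are found is an assumption of this sketch, and the links from B as well as the names `parents_of` and `build_levels` are hypothetical.

```python
def parents_of(children, links):
    """One pass of loop 635 (sketch): collect parents of the given children."""
    return sorted({src for src, dst in links if dst in children})


def build_levels(topical, links, max_levels):
    """Build up to max_levels hierarchy levels above the topical set."""
    levels = []
    current = set(topical)
    for _ in range(max_levels):
        parents = parents_of(current, links)
        if not parents:
            break  # assumed stopping rule: nothing left to group under
        levels.append(parents)
        current = set(parents)  # this level's parents are the next children
    return levels


links = [
    ("A", "C"), ("A", "D"), ("A", "E"),
    ("H", "C"), ("H", "G"), ("H", "F"),
    ("B", "A"), ("B", "H"),  # hypothetical links that make B a grandparent
]
levels = build_levels({"C", "D", "E", "F", "G"}, links, max_levels=3)
```

With these links the first pass yields the parents A and H, and the second pass yields B, a grandparent of the topically relevant objects; the `max_levels` bound plays the role of the pre-defined system constant mentioned for step 635.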

In step 705 the N top ranking child objects from the 
children Object Hit-list 350 are selected for further
processing. This is done by setting the Selector field 375 to 1
for those objects whose Number field 360 is less than or equal
to N. N may be hard-wired into the system, i.e., a pre-defined
value, or selected by the user. In one preferred embodiment 
a value of 10 to 20 would be suitable. The value of N is a 
trade-off between quality and performance. As N becomes 
larger, result quality will improve since more child objects 
are used to generate the hierarchy. However, as N becomes 
larger, more processing is required and performance dete- 
riorates. Moreover, there are diminishing returns as N 
becomes larger, since lower ranked child objects are less 
relevant to the query. 
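
Step 705 amounts to a threshold on the rank number, as the following sketch shows. The entry layout and the function name `select_top_n` are hypothetical; Number values are assumed to start at 1.

```python
def select_top_n(entries, n):
    """Step 705 sketch: mark entries whose rank Number is at most n."""
    for entry in entries:
        entry["selector"] = 1 if entry["number"] <= n else 0
    return [e["object_id"] for e in entries if e["selector"] == 1]


# Hypothetical child hit-list, already ranked 1..5.
entries = [
    {"number": rank, "object_id": object_id, "selector": 0}
    for rank, object_id in enumerate(["C", "D", "E", "F", "G"], start=1)
]
selected = select_top_n(entries, 3)
```

Raising `n` pulls more children into the later parent search at the cost of more relationship rows to scan, which is the quality versus performance trade-off described above.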

In step 710 an iteration over each of the top N child Object 
Hit-list entries 355 whose Selector field 375 is set to 1 is 
performed to find their parents (as defined by the current 
structural relationship). The entry currently being processed, 
called the current object, will be referred to as 355a. If there 
are more objects to process, branch 712 is followed, other- 
wise branch 711 is followed. 

In step 725, all of the instances of the current structural 
relationship in which the current child object participates (as 
a child) are selected. The Object Id 370 for the current 
Object Hit-list entry 355a is used to select entries 455 from 
the Relationship Table 450 for the current relationship 415a
(recall that 415a is the current relationship being processed
in step 630, described above). An entry 455 is selected if its
To-Id 465 matches Object Id 370. Selected entries 455 are
marked by setting their Selector field 473 to 1. The Selector
field 473 for all other (non-selected) entries is set to 0.
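
The selection in step 725 can be sketched as a scan of the Relationship Table that flags the rows pointing at the current child. The row layout, the weights, and the function name `select_rows_for_child` are assumptions for illustration only.

```python
def select_rows_for_child(relationship_table, child_object_id):
    """Step 725 sketch: flag rows whose To-Id equals the child's Object Id."""
    selected = []
    for row in relationship_table:
        row["selector"] = 1 if row["to_id"] == child_object_id else 0
        if row["selector"]:
            selected.append(row)
    return selected


# Hypothetical hyperlink Relationship Table with made-up weights.
table = [
    {"from_id": "A", "to_id": "C", "weight": 1.0, "selector": 0},
    {"from_id": "H", "to_id": "C", "weight": 0.5, "selector": 0},
    {"from_id": "A", "to_id": "D", "weight": 1.0, "selector": 0},
]
rows_for_c = select_rows_for_child(table, "C")
```

Each selected row names one parent of the child in its From-Id, so the rows returned for C identify A and H as its parents under this relationship.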

In step 730 an iteration over each of the entries 455 whose 
Selector field 473 is set to 1 is performed to process each of 
the parent objects in the relationship instances identified in 
step 725. The entry currently being processed will be 
referred to as 455a. If there are more entries to process,
branch 732 is taken, otherwise branch 731 is taken. 

In step 735, the parent object in the current relationship 
instance being processed is ranked and its score in the parent 
object hit-list is updated. The parent Object Hit-list 350' 
being created by the current iteration of 635 is updated for
the current entry 455a. The From-Id 460a for the current
entry 455a is used to identify the appropriate entry 355'a to
update by looking up 460a in the Object Id column 370'. If
an entry 355'a does not exist with Object Id 370' equal to
460a, one is created. The Score 365'a for entry 355'a is
updated by applying formula F, which takes as input the
current Score 365'a, the Weight 470a of the current
relationship instance 455a, and the child object's Score 365a
(from the child Object Hit-list 350). The Children Table 580
for entry 355'a is obtained by following the pointer in field
380'. If entry 355'a is newly created, a new Children Table
580 is created. The Number 360a for entry 355a is added to
Children Table 580 by adding a new entry 590 with its Child
Number 585 set to Number 360a, its Object Hit-list pointer
595 set to point to Object Hit-list 350, and its Relation 575
set to 425 (for the relationship currently being processed). 

In a preferred embodiment, a parent object's structural 
relevance score is computed by summing the weighted rank 
scores of its children objects, where children objects are
those objects in the set identified in step 705 to which the
parent object is related, as determined by Relationship Table
450. A child's weighted rank score is its score in field 365a
multiplied by the relationship weight 470a. To produce this
end result, formula F multiplies Score 365a by Weight 470a
and adds that value to 365'a. Note that when a new entry
355'a is created, the Score field 365'a is initialized to 0.
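
Written out, the preferred scoring rule is a running sum: each parent's score becomes its previous score plus the relationship weight times the child's score. The sketch below applies that rule over a small table; the container layout, the example scores and weights, and the names `formula_f` and `update_parents` are hypothetical.

```python
def formula_f(parent_score, weight, child_score):
    """Formula F as described: add the child's weighted score to the parent's."""
    return parent_score + weight * child_score


def update_parents(children, relationship_table, relation_name, parents=None):
    """Process 700 sketch: accumulate parent scores and record their children."""
    parents = {} if parents is None else parents
    for child in children:
        for from_id, to_id, weight in relationship_table:
            if to_id != child["object_id"]:
                continue
            # New parent entries start with a score of 0 and no children.
            entry = parents.setdefault(from_id, {"score": 0.0, "children": []})
            entry["score"] = formula_f(entry["score"], weight, child["score"])
            entry["children"].append((child["number"], relation_name))
    return parents


# Hypothetical selected children and weighted hyperlinks.
children = [
    {"number": 1, "object_id": "C", "score": 0.5},
    {"number": 2, "object_id": "D", "score": 0.25},
    {"number": 3, "object_id": "G", "score": 0.25},
]
table = [("A", "C", 1.0), ("A", "D", 0.5), ("H", "C", 1.0), ("H", "G", 1.0)]
parents = update_parents(children, table, "hyperlink")
```

Passing the returned dictionary back in as `parents` for a second relationship would let scores from several relationships accumulate in the same parent hit-list, in line with the update-or-add behavior described for the current level.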

After all selected child objects have been processed in 
711, the parent Object Hit-list 350' will have been updated 
for the current relationship. The algorithm then returns to
step 630 in FIG. 10.

FIG. 12 consists of FIGS. 12A and 12B. FIG. 12A shows the
steps for process 900 in detail. A hierarchical result display
is created by processing the most recently created parent
Object Hit-list 350'. In step 905 an iteration over each of
the entries 355' in parent Object
Hit-list 350' is performed, where each entry is processed as 
follows in turn by following path 907. When all entries have 
been processed, path 906 is followed. 

In step 910, process 950 (described in detail in FIG. 12B) 
is called for the current entry.

FIG. 12B shows the steps for process 950 in detail. 
Process 950 operates on an entry 355 from an Object Hit-list 
350. In step 955, the current entry 355 is displayed by 
indenting and printing various attributes (obtained from the 
Object Catalog 210) for the object identified by the current
entry. If the Children field 380 for the current entry 355 is
not empty, then an iteration over the children identified by
the Children Table 580 pointed to by field 380a is performed
in step 960. At each level of the hierarchy, the relationships
used for displaying children may be specified. If
relationships have been specified for this level, they are used
to filter the children that will be displayed by requiring that
a child's Relation 575 match one of the specified
relationships. Each
matching child is processed as follows in turn by following 
path 962. The current entry being processed will be referred 
to as 590a. If there are no more children to process, path 961 
is followed. 

In step 965, the Child Number 585a for the current 
Children Table entry 590a is used to identify the child's 
corresponding hit-list entry 355 by indexing into the Number
field 360 of the Object Hit-list 350 identified in Object
Hit-list identifier field 595a of entry 590a. This hit-list entry 
355 is then processed by calling process 950 recursively for 
that entry. 
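
The recursion in process 950 can be sketched as follows. For brevity this sketch lets each parent hold its child entries directly instead of resolving them through a Child Number and hit-list identifier, and it applies one relationship filter to every level; both are simplifications, and the names `render` and `allowed_relations` are hypothetical.

```python
def render(entry, depth=0, allowed_relations=None, lines=None):
    """Process 950 sketch: emit an indented line, then recurse into children."""
    lines = [] if lines is None else lines
    lines.append("  " * depth + entry["title"])
    for relation, child in entry.get("children", []):
        # Optional filter: show only children attached by an allowed relation.
        if allowed_relations is not None and relation not in allowed_relations:
            continue
        render(child, depth + 1, allowed_relations, lines)
    return lines


# Hypothetical two-level result: parent A with three children.
hierarchy = {
    "title": "A",
    "children": [
        ("hyperlink", {"title": "C"}),
        ("hyperlink", {"title": "D"}),
        ("category", {"title": "E"}),
    ],
}
all_lines = render(hierarchy)
hyperlink_only = render(hierarchy, allowed_relations={"hyperlink"})
```

Joining `all_lines` with newlines gives a listing in which children appear indented under their parent; restricting `allowed_relations` drops the child that is attached only through the category relationship.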

The result of step 900 is a display of ranked hierarchies 
where children are shown grouped and indented under their 
parent. An example of such a display is shown in FIG. 13. 
FIG. 13 shows a sample output result of the system. The 
figure shows the result of iterating step 635 once, such that
a two-level hierarchy is generated. The original topically
relevant objects supplied in step 610 are displayed indented 
as 1320. The structurally relevant parent objects found after 
one iteration of step 635 are displayed non-indented as 1310. 
The parent objects 1310 form the next level of the hierar- 
chical view, provide navigational starting points for brows- 
ing the relevant objects, and group the topically relevant 
child objects 1320. The display provides the end user with 
insight into the structure of the object collection being 
searched. Attributes for each of the objects shown in the 
display are obtained from the Object Catalog 210 and 
Attribute Tables 250. 

Given this disclosure, alternative equivalent embodiments
will become apparent to those skilled in the art. These 
embodiments are also within the contemplation of the inven- 
tors. 

We claim: 

1. A computer system having one or more memories and 
one or more central processing units, the computer capable 
of accessing one or more database memories, a plurality of 
objects stored in one or more of the database memories and 
each object identified by an index, the computer system 
further comprising: 

a user interface for entering a query; 
a hit list, stored in one or more of the memories, the hit 
list being a topically relevant set of a plurality of 
topically relevant objects, the topically relevant set 
being selected from one or more of the database 
memories and ranked by a topical search engine using 
the index, the rank indicating a topical relevance to the 
query; 

a relationship data structure containing information about 
how two or more of the objects are related to one 
another by one or more structural relationships, the 
structural relationships being directed; and 

a hierarchical view generator that selects a structurally 
relevant set of the objects, each object in the structur- 
ally relevant set being a candidate parent of one or more 
of the topically relevant objects according to one or 
more of the structural relationships, said hierarchical 
view generator dynamically determining which of the 
candidates for parents are selected as actual parents for 
said query entered by said user interface, based on said 
one or more structural relationships, 

each of the topically relevant objects being a child of one 
or more of the selected parents, the hierarchical view 
generator organizing each of the child objects under 
one or more of its respective selected parents to create 
a directional hierarchy. 




2. A computer system, as in claim 1, further comprising a 
display that renders a presentation of one or more of the 
selected parents and the child objects associated with the 
respective selected parent. 

3. A computer system, as in claim 1, where the database 
memory resides on one or more connected computers that 
are connected to the computer system by a network. 

4. A computer system, as in claim 3, where the query 
accesses the computer system from the network. 

5. A computer system, as in claim 1, where the hierarchical
view generator subselects a subset of the most relevant
topically relevant objects according to rank as the
topically relevant set.

6. A computer system, as in claim 1, where the informa- 
tion in the relationship data structure includes weights for 
the structural relationships and the hierarchical view gen- 
erator subselects a subset of the structurally relevant objects 
as the structurally relevant set as a function of the weights. 

7. A computer system, as in claim 1, where one or more 
relationship data structures defining one or more structural 
relationships are selected using information in the query. 

8. A computer system, as in claim 1, where the structural 
relationships include any one or more of the following: the 
hyperlink relationship, the subclass or categorization 
relationship, the geographic location relationship, the loca- 
tion within a file system relationship, the conceptual 
relationship, and the is-a-part-of relationship. 

9. The system of claim 1, wherein said hit-list is organized 
into a hierarchical structure of parent and children objects, 

the connections in the structure corresponding to one or 
more of the structural relationships, and said structured 
hit-list allowing a user to understand a nature of the 
results of the query, and to find an appropriate object to 
satisfy the user's information need. 

10. The system of claim 1, wherein a topical search, 
responsive to the query, comprises a content-based similar- 
ity search where the query does not define a precise set of 
objects, 

the search engine computing the similarity of the query to 
each object in the database to generate a ranked list of 
objects ordered by relevance to the query. 
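The content-based similarity search of claim 10 can be sketched as follows. The claim does not name a similarity measure; cosine similarity over raw term counts is assumed here purely for illustration, as are the function names, the whitespace tokenizer, and the choice to drop objects with zero similarity.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_objects(query, objects):
    """Return object ids ordered by decreasing similarity to the query.

    objects: object_id -> text content (hypothetical layout)
    """
    q = Counter(query.lower().split())
    scored = []
    for oid, text in objects.items():
        s = cosine(q, Counter(text.lower().split()))
        if s > 0:  # assumption: objects sharing no term are left off the list
            scored.append((s, oid))
    # Highest similarity first; ties broken by id for a deterministic order.
    return [oid for s, oid in sorted(scored, key=lambda x: (-x[0], x[1]))]


objects = {
    "d1": "patent search engine",
    "d2": "search search search",
    "d3": "cooking recipes",
}
hit_list = rank_objects("search engine", objects)
```

The query does not define a precise set of objects; every object is scored, and the ranked hit list is whatever scores above zero.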

11. The system of claim 1, wherein said objects comprise 
data objects, and said data objects are stored in a database 
with relationships defined between the objects, 

said hierarchical view generator selecting and processing 
the relationships to identify structurally relevant par- 
ents from an original child set. 

12. The system of claim 1, wherein said hierarchical view 
generator includes an iterator, said iterator processing struc- 
tural relationships to build a parent hierarchy based on 
topical and structural relevance, 

said iterator selecting structural relationships in a query 
context and ranks parent objects based on their struc- 
tural relevance to the query. 

13. The system of claim 1, wherein the hierarchical 
structure groups objects for answering a user's query, and 
provides insight into the relevant structure of the database, 

to satisfy the user's information need. 

14. The system of claim 1, wherein the user's query 
selects certain relationships for finding structurally relevant 
objects and for organizing and displaying the result list 
accordingly, 

wherein said objects are ranked based on degrees of 
topical relevance, and a subset of the topically relevant 
objects is selected for further relationship processing. 

15. The system of claim 1, wherein said objects on the 
result list are organized based on their structural 
relationships, and display of objects at certain levels of the 
hierarchy is suppressed based on the relationships in which 
they participate, 
wherein objects in the topically relevant set are ranked, 
and the children thereof are organized under their 
parents according to a structural relationship. 

16. A computer system having one or more memories and 
one or more central processing units, the computer capable 
of accessing one or more database memories, a plurality of 
objects stored in one or more of the database memories and 
identified by an index, the computer system further com- 
prising: 

a user interface for entering a query; 

a hit list, stored in one or more of the memories, the hit 
list being a topically relevant set of a plurality of 
topically relevant objects, the topically relevant set 
being selected from one or more of the database 
memories and ranked by a topical search engine using 
the index, the rank indicating a topical relevance to the 
query, 

a relationship data structure containing information about 
how two or more of the objects are related to one 
another by one or more structural relationships, the 
structural relationships being directed; 

a hierarchical view generator that selects a structurally 
relevant set of the objects, each object in the structur- 
ally relevant set being a candidate parent of one or more 
of the topically relevant objects according to one or 
more of the structural relationships, said hierarchical 
view generator dynamically determining which of the 
candidates for parents are selected as actual parents for 
said query entered by said user interface, based on said 
one or more structural relationships, 

each of the topically relevant objects being a child of one 
or more of the selected parents, the hierarchical view 
generator organizing each of the child objects under 
one or more of its respective selected parents to create 
a directional hierarchy; and 

an iterator in the hierarchical view generator that treats the 
selected parents as topically relevant objects and runs 
the hierarchical view generator zero or more times, 
each time creating a parent level in the directional 
hierarchy. 

17. A system, as in claim 16, where a different set of one 
or more structural relationships is selected in one or more 
iterations of the iterator. 

18. A computer system, as in claim 16, further compris- 
ing: 

a display that renders a presentation of one or more 
selected parents in one of the parent levels and one or 
more of the objects, being displayed objects, structur- 
ally related to one of the selected parents and at a lower 
level than the parent level in the directional hierarchy. 

19. A system, as in claim 18, where one or more of the 
displayed objects is suppressed in the display. 

20. A system, as in claim 19, where the displayed object 
is suppressed due to its structural relationship with the 
respective selected parent. 

21. A computer system having one or more memories and 
one or more central processing units, the computer capable 
of accessing one or more database memories, a plurality of 
objects stored in one or more of the database memories and 
each object identified by an index, the computer system 
further comprising: 

means for selecting a topically relevant set of two or more 
of the objects; 
means for ranking the objects in the topically relevant set; 

means for identifying one or more structural relationships 
between one or more of the objects in the topically 
relevant set, being children, and one or more of the 
objects in the database memories, being parents, the 
structural relationships being directional from the par- 
ent to the child, said identifying means including means 
for dynamically determining which of the candidates 
for parents are selected as actual parents for said query 
entered by said user interface based on said one or more 
structural relationships; and 

means for organizing one or more of the children under 
each of the respective selected parents in a structural 
hierarchy. 

22. A method of hierarchically grouping a plurality of 
objects stored in one or more database memories of a 
computer system, comprising steps of: 

a. selecting a topically relevant set of two or more of the 
objects; 

b. ranking the objects in the topically relevant set; 

c. identifying one or more structural relationships 
between one or more of the objects in the topically 
relevant set, being children, and one or more of the 
objects in the database memories, being parents, the 
structural relationships being directional from the par- 
ent to the child, said step of dynamically determining 
which of the candidates for parents are selected as 
actual parents for said query entered by said user 
interface based on said one or more structural relation- 
ships; and 

d. organizing one or more of the children under each of 
the respective selected parents. 

23. A method, as in claim 22, further comprising steps of: 

e. deselecting the children and identifying the parents as 
children; and 

f. repeating steps c and d zero or more times, each time 
creating a next hierarchical level. 
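Steps a through f of claims 22 and 23 can be sketched as a short loop. This is an illustration under stated assumptions, not the patented implementation: the hit list is taken as already selected and ranked (steps a and b), the relationship data is assumed to be a child-to-parents mapping, every candidate parent is accepted as an actual parent, and all names are hypothetical.

```python
def build_hierarchy(ranked_hits, parents_of, levels=1):
    """Group children under their parents, one hierarchy level per pass.

    ranked_hits: object ids ordered by topical rank (result of steps a, b)
    parents_of:  child_id -> list of parent ids, i.e. the directed
                 parent-to-child relationships (hypothetical layout)
    levels:      1 + the number of repetitions in step f
    Returns a list of levels; each level maps parent -> ordered children.
    """
    hierarchy = []
    children = list(ranked_hits)
    for _ in range(levels):
        level = {}
        for child in children:
            # Step c: identify the parents related to this child.
            # Assumption: every candidate parent becomes an actual parent.
            for parent in parents_of.get(child, []):
                # Step d: organize the child under its selected parent,
                # keeping the children in their incoming rank order.
                level.setdefault(parent, []).append(child)
        if not level:
            break  # no parents found, so no further level can be created
        hierarchy.append(level)
        # Step e: deselect the children and treat the parents as children.
        children = list(level)
    return hierarchy


parents_of = {
    "p1": ["s1"],
    "p2": ["s1"],
    "p3": ["s2"],
    "s1": ["root"],
    "s2": ["root"],
}
tree = build_hierarchy(["p1", "p2", "p3"], parents_of, levels=2)
```

With `levels=2` the first pass groups the three hits under "s1" and "s2", and the second pass groups those two parents under "root". A real system would replace the accept-all rule in step c with the query-dependent parent selection recited in the claims.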